Three Approaches to Automatic Assignment of ICD-9-CM Codes to Radiology Reports
Abstract
We describe and evaluate three systems for automatically predicting the ICD-9-CM codes of radiology reports from short excerpts of text. The first system builds on an open source search engine, Lucene, and exploits the relevance of reports to one another based on individual words. The second uses BoosTexter, a boosting algorithm based on n-grams (sequences of consecutive words) and s-grams (sequences of non-consecutive words) extracted from the reports. The third employs a set of hand-crafted rules that capture lexical elements (short, meaningful strings of words) derived from BoosTexter's n-grams and that are enhanced by shallow semantic information in the form of negation, synonymy, and uncertainty. Our evaluation shows that semantic information contributes significantly to ICD-9-CM coding with lexical elements. In addition, a simple hand-crafted rule-based system with lexical elements and semantic information can outperform algorithmically more complex systems, such as Lucene and BoosTexter, when these systems base their ICD-9-CM predictions only upon individual words, n-grams, or s-grams.
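To make the distinction between n-grams and s-grams concrete, the following is a minimal Python sketch of the two feature types as the abstract defines them. It is an illustration only, not the feature extraction used by BoosTexter or by the authors: the function names `ngrams` and `sgrams`, the `max_gap` parameter, the restriction of s-grams to word pairs, and the sample phrase are all assumptions introduced here.

```python
def ngrams(tokens, n):
    """Return all sequences of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sgrams(tokens, max_gap=2):
    """Return ordered word pairs that skip between 1 and max_gap tokens.

    Assumption: s-grams are limited to pairs here for simplicity.
    """
    pairs = []
    for i, left in enumerate(tokens):
        for gap in range(1, max_gap + 1):
            j = i + gap + 1  # index of the right-hand word after the skip
            if j < len(tokens):
                pairs.append((left, tokens[j]))
    return pairs


# Hypothetical report excerpt, chosen only for illustration.
tokens = "no evidence of pneumonia".split()
bigrams = ngrams(tokens, 2)
skip_pairs = sgrams(tokens, max_gap=2)
print(bigrams)
print(skip_pairs)
```

In this sketch, the pair `("no", "pneumonia")` appears only among the s-grams, which suggests why non-consecutive features might help a learner pick up cues such as negation that plain n-grams would miss.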
Similar resources
Automatic Code Assignment to Medical Text
Code assignment is important for handling large amounts of electronic medical data in the modern hospital. However, only expert annotators with extensive training can assign codes. We present a system for the assignment of ICD-9-CM clinical codes to free text radiology reports. Our system assigns a code configuration, predicting one or more codes for each document. We combine three coding syste...
MIDAS: An Information-Extraction Approach to Medical Text Classification
This article describes MIDAS, an advanced expert system that can suggest a medical diagnosis from radiological/clinical patient records, based on information extraction and machine learning from the clinical histories of previously diagnosed patients. MIDAS was designed to participate in the 2007 Medical Natural Language Processing Challenge. Specifically, it automates the assignment of IC...
Unsupervised Extraction of Diagnosis Codes from EMRs Using Knowledge-Based and Extractive Text Summarization Techniques
Diagnosis codes are extracted from medical records for billing and reimbursement and for secondary uses such as quality control and cohort identification. In the US, these codes come from the standard terminology ICD-9-CM derived from the international classification of diseases (ICD). ICD-9 codes are generally extracted by trained human coders by reading all artifacts available in a patient's ...
Associating Medical Concept Relations with ICD-9-CM Coding Rules
A simple method for mining interesting relations between concepts in a certain state is presented. Those relations should reveal the same rules that a human coder uses while assigning ICD-9-CM billing codes. This study focuses only on the three most frequently occurring medical concepts in radiology dictations: cough, fever, and pneumonia. An algorithm that tries to associate relation between two concepts in a cert...
Handling Age Specification in the SNOMED CT to ICD-10-CM Cross-map
A SNOMED CT-encoded problem list will be required to satisfy the Certification Criteria for Stage 2 "Meaningful Use" of the EHR incentive program. ICD-10-CM will be replacing ICD-9-CM as the reimbursement code set in the near future. Having a cross-map from SNOMED CT to ICD-10-CM will promote the use of SNOMED CT as the primary problem list terminology, while easing the transition to ICD-10-CM....
Journal:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Publication date: 2007